In mathematics, in particular probability theory and related fields, the softmax function, or normalized exponential, is a generalization of the logistic function that "squashes" a ''K''-dimensional vector <math>\mathbf{z}</math> of arbitrary real values to a ''K''-dimensional vector <math>\sigma(\mathbf{z})</math> of real values in the range (0, 1) that add up to 1. The function is given by
:<math>\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^K e^{z_k}}</math> for ''j'' = 1, ..., ''K''.

The softmax function is the gradient-log-normalizer of the categorical probability distribution. For this reason, the softmax function is used in various probabilistic multiclass classification methods, including multinomial logistic regression, multiclass linear discriminant analysis, naive Bayes classifiers and artificial neural networks.〔ai-faq (What is a softmax activation function?)〕 Specifically, in multinomial logistic regression and linear discriminant analysis, the input to the function is the result of ''K'' distinct linear functions, and the predicted probability for the ''j''th class given a sample vector <math>\mathbf{x}</math> is:
:<math>P(y = j \mid \mathbf{x}) = \frac{e^{\mathbf{x}^\mathsf{T} \mathbf{w}_j}}{\sum_{k=1}^K e^{\mathbf{x}^\mathsf{T} \mathbf{w}_k}}</math>
This can be seen as the composition of ''K'' linear functions <math>\mathbf{x} \mapsto \mathbf{x}^\mathsf{T} \mathbf{w}_1, \ldots, \mathbf{x} \mapsto \mathbf{x}^\mathsf{T} \mathbf{w}_K</math> and the softmax function (where <math>\mathbf{x}^\mathsf{T} \mathbf{w}</math> denotes the inner product of <math>\mathbf{x}</math> and <math>\mathbf{w}</math>).

== Artificial neural networks ==
In neural network simulations, the softmax function is often implemented at the final layer of a network used for classification. Such networks are then trained under a log loss (or cross-entropy) regime, giving a non-linear variant of multinomial logistic regression.

Since the function maps a vector and a specific index ''i'' to a real value, the derivative needs to take the index into account:
:<math>\frac{\partial}{\partial q_k} \sigma(\mathbf{q}, i) = \sigma(\mathbf{q}, i)\left(\delta_{ik} - \sigma(\mathbf{q}, k)\right)</math>
Here, the Kronecker delta is used for simplicity (cf. the derivative of a sigmoid function, being expressed via the function itself).

See Multinomial logit for a probability model which uses the softmax activation function.
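The following short Python sketch (using NumPy; the names W, x and q follow the notation above, and the dimensions are chosen only for illustration, not taken from the article) shows the definition in practice: the softmax of ''K'' linear scores, and a finite-difference check of the derivative formula. Subtracting the largest entry before exponentiating is a common numerical-stability step and does not change the result, since the function is invariant under adding the same constant to every component.

<syntaxhighlight lang="python">
import numpy as np

def softmax(z):
    """Softmax of a 1-D array z; shifting by max(z) avoids overflow
    without changing the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Predicted class probabilities from K linear functions, as in
# multinomial logistic regression: p_j = softmax(W @ x)_j.
rng = np.random.default_rng(0)
K, d = 4, 3                       # illustrative sizes, not from the article
W = rng.normal(size=(K, d))       # one weight vector w_j per class
x = rng.normal(size=d)
p = softmax(W @ x)
print(p, p.sum())                 # entries in (0, 1) that sum to 1

# Derivative check: d softmax(q)_i / d q_k = softmax(q)_i * (delta_ik - softmax(q)_k)
q = rng.normal(size=K)
s = softmax(q)
jacobian = np.diag(s) - np.outer(s, s)   # closed form from the formula above
eps = 1e-6
numeric = np.empty((K, K))
for k in range(K):
    dq = np.zeros(K)
    dq[k] = eps
    numeric[:, k] = (softmax(q + dq) - softmax(q - dq)) / (2 * eps)
print(np.allclose(jacobian, numeric, atol=1e-8))  # True: both agree
</syntaxhighlight>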